Quantitative Analysis of the Stock Market

Quantitative Analysis of the stock market is applied to understand, predict, and make decisions about financial investments based on mathematical and statistical techniques. We will go through the process of Quantitative Analysis of the stock market in this notebook.

Quantitative Analysis in the stock market is a financial methodology that utilizes mathematical and statistical techniques to analyze stocks and financial markets.

Here is the process to be followed during Quantitative Analysis of the stock market:

  1. Clearly define the objectives and questions to be answered.
  2. Identify the key performance indicators (KPIs) relevant to the analysis.
  3. Gather historical stock market data, including prices, volumes, and other relevant financial indicators.
  4. Clean and preprocess the data to handle missing values, outliers, and errors.
  5. Conduct initial analysis to understand data distributions, patterns, and correlations.
  6. Implement various strategies based on quantitative analysis.

To get started with this task, we can use the dataset found here.

Let’s get started with the task of Quantitative Analysis of the stock market by importing the necessary Python libraries and the dataset:
In [1]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio
pio.templates.default = "plotly_white"

# Load the dataset
stocks_data = pd.read_csv("stocks.csv")

# Display the first few rows of the dataset
print(stocks_data.head())
  Ticker        Date        Open        High         Low       Close  \
0   AAPL  2023-02-07  150.639999  155.229996  150.639999  154.649994   
1   AAPL  2023-02-08  153.880005  154.580002  151.169998  151.919998   
2   AAPL  2023-02-09  153.779999  154.330002  150.419998  150.869995   
3   AAPL  2023-02-10  149.460007  151.339996  149.220001  151.009995   
4   AAPL  2023-02-13  150.949997  154.259995  150.919998  153.850006   

    Adj Close    Volume  
0  154.414230  83322600  
1  151.688400  64120100  
2  150.639999  56007100  
3  151.009995  57450700  
4  153.850006  62199000  

The dataset explained:

  • Ticker: The stock ticker symbol.
  • Date: The trading date.
  • Open: The opening price of the stock for the day.
  • High: The highest price of the stock during the day.
  • Low: The lowest price of the stock during the day.
  • Close: The closing price of the stock for the day.
  • Adj Close: The adjusted closing price, which accounts for all corporate actions such as dividends, stock splits, etc.
  • Volume: The number of shares traded during the day.
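
Before analysing these columns, it can help to run a few basic quality checks, in the spirit of step 4 of the process above. The following is a minimal sketch, not part of the original notebook: it inlines the five rows printed above so that it runs without stocks.csv, and the variable names (`sample`, `missing_counts`, `duplicate_rows`, `bad_high`, `bad_low`) are illustrative assumptions.

```python
import io
import pandas as pd

# First rows of the dataset as printed above, inlined so the sketch runs
# without stocks.csv; in the notebook, stocks_data could be used instead.
sample_csv = """Ticker,Date,Open,High,Low,Close,Adj Close,Volume
AAPL,2023-02-07,150.639999,155.229996,150.639999,154.649994,154.414230,83322600
AAPL,2023-02-08,153.880005,154.580002,151.169998,151.919998,151.688400,64120100
AAPL,2023-02-09,153.779999,154.330002,150.419998,150.869995,150.639999,56007100
AAPL,2023-02-10,149.460007,151.339996,149.220001,151.009995,151.009995,57450700
AAPL,2023-02-13,150.949997,154.259995,150.919998,153.850006,153.850006,62199000
"""
sample = pd.read_csv(io.StringIO(sample_csv), parse_dates=["Date"])

# Missing values per column
missing_counts = sample.isna().sum()

# Duplicate (Ticker, Date) pairs would make a Date x Ticker pivot fail
duplicate_rows = int(sample.duplicated(subset=["Ticker", "Date"]).sum())

# Price sanity: High should be the day's maximum and Low the day's minimum
price_cols = ["Open", "High", "Low", "Close"]
bad_high = int((sample["High"] < sample[price_cols].max(axis=1)).sum())
bad_low = int((sample["Low"] > sample[price_cols].min(axis=1)).sum())

print(missing_counts)
print("duplicate rows:", duplicate_rows)
print("rows with inconsistent High/Low:", bad_high + bad_low)
```

If any of these counts is non-zero on the full dataset, the affected rows are worth inspecting before computing statistics on them.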

To perform a quantitative analysis, we can draw on various statistical concepts such as descriptive statistics, time series analysis, and correlation analysis, among others. Here are some analyses we can use:

  • Descriptive Statistics: Summary statistics (mean, median, standard deviation, etc.) for each stock.
  • Time Series Analysis: Trends and patterns over time, especially for closing prices.
  • Volatility Analysis: How much the stock price fluctuates over a period.
  • Correlation Analysis: How stock prices of different companies are related to each other.
  • Comparative Analysis: Comparing the performance of different stocks.
  • Risk-Return Trade-off Analysis: Analyzing the balance between the potential risks and rewards of different stocks, aiding in portfolio management.

Let’s implement all these concepts of Quantitative Analysis of the stock market one by one.

Descriptive Statistics

Descriptive Statistics will provide summary statistics for each stock in the dataset. We’ll look at measures such as mean, median, standard deviation, and more for the Close prices:

In [2]:
# Descriptive Statistics for each stock
descriptive_stats = stocks_data.groupby('Ticker')['Close'].describe()

print(descriptive_stats)
        count        mean        std         min         25%         50%  \
Ticker                                                                     
AAPL     62.0  158.240645   7.360485  145.309998  152.077499  158.055000   
GOOG     62.0  100.631532   6.279464   89.349998   94.702501  102.759998   
MSFT     62.0  275.039839  17.676231  246.270004  258.742500  275.810013   
NFLX     62.0  327.614677  18.554419  292.760010  315.672493  325.600006   

               75%         max  
Ticker                          
AAPL    165.162506  173.570007  
GOOG    105.962503  109.459999  
MSFT    287.217506  310.649994  
NFLX    338.899994  366.829987  

Let's break down the meaning of these numbers:

  • Count: The number of observations (trading days) in the dataset for the given ticker
  • Mean: The average closing price
  • Standard Deviation: Measures the amount of variation or dispersion of closing prices
  • Minimum: The lowest closing price in the dataset
  • 25th Percentile: 25% of the closing prices are below this value
  • Median (50%): The middle value of the closing prices
  • 75th Percentile: 75% of the closing prices are below this value
  • Maximum: The highest closing price in the dataset

Now let's look at each ticker:

  1. AAPL (Apple Inc.)
    • Count: 62.0
    • Mean: 158.24
    • Standard Deviation: 7.36
    • Minimum: 145.31
    • 25th Percentile: 152.08
    • Median (50%): 158.06
    • 75th Percentile: 165.16
    • Maximum: 173.57
  2. GOOG (Alphabet Inc.): The same statistics apply to GOOG. The mean closing price is 100.63 with a standard deviation of 6.28, a smaller absolute spread than AAPL's, although GOOG also trades at a lower price level.

  3. MSFT (Microsoft Corporation): The dataset includes the same number of observations (62) for MSFT. It has a higher mean closing price of 275.04 and a higher standard deviation of 17.68, suggesting greater absolute price variability than AAPL and GOOG.

  4. NFLX (Netflix Inc.): NFLX shows the highest mean closing price (327.61) among these stocks and the highest standard deviation (18.55), indicating the largest price fluctuations in dollar terms.
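
Because these stocks trade at very different price levels, their standard deviations are not directly comparable. One common way to put them on the same footing is the coefficient of variation (standard deviation divided by the mean). The sketch below is an addition to the original analysis; it uses the mean and std values copied from the describe() output above, and the column name `cv_pct` is an illustrative choice.

```python
import pandas as pd

# Mean and standard deviation of Close, copied from the describe() output above
stats = pd.DataFrame(
    {
        "mean": [158.240645, 100.631532, 275.039839, 327.614677],
        "std": [7.360485, 6.279464, 17.676231, 18.554419],
    },
    index=["AAPL", "GOOG", "MSFT", "NFLX"],
)

# Coefficient of variation: dispersion relative to the average price level (%)
stats["cv_pct"] = stats["std"] / stats["mean"] * 100

ranked = stats.sort_values("cv_pct", ascending=False)
print(ranked.round(2))
```

On this relative measure the ranking changes: MSFT and GOOG vary the most relative to their price level (roughly 6.4% and 6.2%), NFLX comes third (about 5.7%), and AAPL remains the most stable (about 4.7%).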

Time Series Analysis

Next, we move to Time Series Analysis to examine trends and patterns over time, focusing on the closing prices:

In [3]:
# Time Series Analysis
stocks_data['Date'] = pd.to_datetime(stocks_data['Date'])
pivot_data = stocks_data.pivot(index='Date', columns='Ticker', values='Close')

# Create a subplot
fig = make_subplots(rows=1, cols=1)

# Add traces for each stock ticker
for column in pivot_data.columns:
    fig.add_trace(
        go.Scatter(x=pivot_data.index, y=pivot_data[column], name=column),
        row=1, col=1
    )

# Update layout
fig.update_layout(
    title_text='Time Series of Closing Prices',
    xaxis_title='Date',
    yaxis_title='Closing Price',
    legend_title='Ticker',
    showlegend=True
)

# Show the plot
fig.show()

The plot shows the time series of the closing prices for each stock (AAPL, GOOG, MSFT, NFLX) over the given period. We can observe the following:

  • Trend: Each stock shows its unique trend over time. For instance, AAPL and MSFT exhibit a general upward trend in this period.
  • Volatility: There is noticeable volatility in the stock prices. For example, NFLX shows more pronounced fluctuations compared to others.
  • Comparative Performance: When comparing the stocks, MSFT and NFLX generally trade at higher price levels than AAPL and GOOG in this dataset.
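
Since the stocks trade at different price levels, trends can be easier to compare when every series is rebased to a common starting value. The following is a minimal sketch of that idea and is not part of the original notebook. The tickers `AAA` and `BBB` and their prices are hypothetical values made up for illustration, not data from stocks.csv.

```python
import pandas as pd

# Hypothetical closing prices for two made-up tickers (not from stocks.csv),
# chosen only to show the effect of very different price levels.
prices = pd.DataFrame(
    {"AAA": [50.0, 51.0, 52.5, 53.0], "BBB": [300.0, 303.0, 297.0, 306.0]},
    index=pd.to_datetime(["2023-02-07", "2023-02-08", "2023-02-09", "2023-02-10"]),
)

# Divide each column by its first value and scale so every series starts at 100
rebased = prices / prices.iloc[0] * 100

print(rebased.round(2))
```

In the notebook, the same expression applied to pivot_data (`pivot_data / pivot_data.iloc[0] * 100`) could be passed to the plotting loop above, so that all four lines start at 100 and relative gains or losses are visible directly.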

Volatility Analysis

Next is Volatility Analysis. We’ll calculate and compare the volatility of each stock, measured here as the standard deviation of its closing prices. Because it is computed on price levels, this measure is expressed in dollars and gives an insight into how much each stock’s price fluctuated over the period:

In [4]:
# Volatility Analysis
volatility = pivot_data.std().sort_values(ascending=False)

fig = px.bar(volatility,
             x=volatility.index,
             y=volatility.values,
             labels={'y': 'Standard Deviation', 'x': 'Ticker'},
             title='Volatility of Closing Prices (Standard Deviation)')

# Show the figure
fig.show()

The bar chart and the accompanying data display the volatility (measured as standard deviation) of the closing prices for each stock. Here’s how they rank in terms of volatility:

  • NFLX: Highest volatility with a standard deviation of approximately 18.55.
  • MSFT: Next highest, with a standard deviation of around 17.68.
  • AAPL: Lower volatility compared to NFLX and MSFT, with a standard deviation of about 7.36.
  • GOOG: The least volatile in this set, with a standard deviation of approximately 6.28.

This indicates that NFLX and MSFT prices fluctuated more in absolute terms during this period than those of AAPL and GOOG.
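
Volatility is also often measured on daily returns rather than on price levels, which removes the effect of the price level itself. The sketch below is an addition under stated assumptions: it uses only the first five AAPL closing prices shown in the head() output, and it annualizes with 252 trading days per year, which is a common convention rather than something taken from the dataset.

```python
import numpy as np
import pandas as pd

# First five AAPL closing prices from the head() output above
close = pd.Series(
    [154.649994, 151.919998, 150.869995, 151.009995, 153.850006],
    index=pd.to_datetime(
        ["2023-02-07", "2023-02-08", "2023-02-09", "2023-02-10", "2023-02-13"]
    ),
    name="AAPL",
)

# Daily percentage returns; the first row has no previous day and is dropped
returns = close.pct_change().dropna()

# Sample standard deviation of the daily returns
daily_vol = returns.std()

# Assumption: 252 trading days per year, a common convention for annualizing
TRADING_DAYS = 252
annualized_vol = daily_vol * np.sqrt(TRADING_DAYS)

print(f"daily volatility: {daily_vol:.4%}")
print(f"annualized volatility: {annualized_vol:.4%}")
```

Five observations are far too few for a reliable estimate; the point is only the mechanics. Applied to the full data, `pivot_data.pct_change().std()` would give the per-ticker equivalent.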

Correlation Analysis

Next, we move on to Correlation Analysis to understand how the stock prices of these companies are related to each other:

In [5]:
# Correlation Analysis
correlation_matrix = pivot_data.corr()

fig = go.Figure(data=go.Heatmap(
                    z=correlation_matrix,
                    x=correlation_matrix.columns,
                    y=correlation_matrix.columns,
                    colorscale='blues',
                    colorbar=dict(title='Correlation'),
                    ))

# Update layout
fig.update_layout(
    title='Correlation Matrix of Closing Prices',
    xaxis_title='Ticker',
    yaxis_title='Ticker'
)

# Show the figure
fig.show()

The heatmap above displays the correlation matrix of the closing prices of the four stocks. We can observe:

  • Values close to +1 indicate a strong positive correlation, meaning that as one stock’s price increases, the other tends to increase as well.
  • Values close to -1 indicate a strong negative correlation, where one stock’s price increase corresponds to a decrease in the other.
  • Values around 0 indicate a lack of correlation.

From the heatmap, we can observe that there are varying degrees of positive correlations between the stock prices, with some pairs showing stronger correlations than others. For instance, AAPL and MSFT seem to have a relatively higher positive correlation.
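
One caveat worth keeping in mind is that correlations computed on price levels can look strong simply because two series share a trend. Comparing them with correlations of daily returns is a common cross-check. The sketch below illustrates the difference and is not part of the original notebook; the tickers `AAA` and `BBB` and their prices are hypothetical values constructed so that both series drift upward while moving in opposite directions day to day.

```python
import pandas as pd

# Hypothetical prices for two made-up tickers (not from stocks.csv):
# both trend upward overall, but their day-to-day moves are opposite.
prices = pd.DataFrame(
    {
        "AAA": [100.0, 103.0, 102.0, 105.0, 104.0, 107.0],
        "BBB": [200.0, 199.0, 204.0, 203.0, 208.0, 207.0],
    }
)

# Correlation of price levels, as in the heatmap above
level_corr = prices.corr()

# Correlation of daily percentage returns
return_corr = prices.pct_change().dropna().corr()

print("price-level correlation:")
print(level_corr.round(2))
print("daily-return correlation:")
print(return_corr.round(2))
```

Here the price-level correlation is positive while the return correlation is negative. For the notebook's data, `pivot_data.pct_change().dropna().corr()` would give the corresponding return-based matrix.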

Comparative Analysis

Next is Comparative Analysis. We will compare the performance of different stocks based on their returns over the given period. We’ll calculate the percentage change in closing prices from the start to the end of the period for each stock:

In [6]:
# Calculating the percentage change in closing prices
percentage_change = ((pivot_data.iloc[-1] - pivot_data.iloc[0]) / pivot_data.iloc[0]) * 100

fig = px.bar(percentage_change,
             x=percentage_change.index,
             y=percentage_change.values,
             labels={'y': 'Percentage Change (%)', 'x': 'Ticker'},
             title='Percentage Change in Closing Prices')

# Show the plot
fig.show()

The bar chart and the accompanying data show the percentage change in the closing prices of the stocks over the given period:

  • MSFT: The highest positive change of approximately 16.10%.
  • AAPL: Exhibited a positive change of approximately 12.23%. It indicates a solid performance, though slightly lower than MSFT’s.
  • GOOG: Showed a slight negative change of about -1.69%. It indicates a minor decline in its stock price over the observed period.
  • NFLX: Experienced the most significant negative change, at approximately -11.07%. It suggests a notable decrease in its stock price during the period.
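
The start-to-end percentage change used above is equivalent to compounding the daily returns over the same period. The sketch below checks this on the first five AAPL closing prices from the head() output. It is an illustrative addition, and the names `total_change` and `compounded` are assumptions of this example.

```python
import pandas as pd

# First five AAPL closing prices from the head() output above
close = pd.Series([154.649994, 151.919998, 150.869995, 151.009995, 153.850006])

# Start-to-end percentage change, using the same formula as the cell above
total_change = (close.iloc[-1] - close.iloc[0]) / close.iloc[0] * 100

# The same figure obtained by compounding the daily returns
daily = close.pct_change().dropna()
compounded = ((1 + daily).prod() - 1) * 100

print(f"start-to-end change: {total_change:.4f}%")
print(f"compounded daily returns: {compounded:.4f}%")
```

The two figures agree up to floating-point rounding, which is why the simple start-to-end formula is enough when only the total change over the period is needed.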

Daily Risk vs. Return Analysis

Finally, to perform a Risk vs. Return Analysis, we will calculate the average daily return and the standard deviation of daily returns for each stock. The standard deviation will serve as a proxy for risk, while the average daily return represents the expected return.

We will then plot these values to visually assess the risk-return profile of each stock. Stocks with higher average returns and lower risk (standard deviation) are generally more desirable, but investment decisions often depend on the investor’s risk tolerance:

In [7]:
# Daily percentage returns; the first row is NaN and is dropped
daily_returns = pivot_data.pct_change().dropna()

# Calculate the average daily return and the standard deviation (risk)
avg_daily_return = daily_returns.mean()
risk = daily_returns.std()

# Creating a DataFrame for plotting
risk_return_df = pd.DataFrame({'Risk': risk, 'Average Daily Return': avg_daily_return})

fig = go.Figure()

# Add scatter plot points
fig.add_trace(go.Scatter(
    x=risk_return_df['Risk'],
    y=risk_return_df['Average Daily Return'],
    mode='markers+text',
    text=risk_return_df.index,
    textposition="top center",
    marker=dict(size=10)
))

# Update layout
fig.update_layout(
    title='Risk vs. Return Analysis',
    xaxis_title='Risk (Standard Deviation)',
    yaxis_title='Average Daily Return',
    showlegend=False
)

# Show the plot
fig.show()

Based on the scatter chart above:

  • AAPL shows the lowest risk combined with a positive average daily return, suggesting a more stable investment with consistent returns.
  • GOOG has higher volatility than AAPL and, on average, a slightly negative daily return, indicating a riskier and less rewarding investment during this period.
  • MSFT shows moderate risk with the highest average daily return, suggesting a potentially more rewarding investment, although with higher volatility compared to AAPL.
  • NFLX exhibits the highest risk and a negative average daily return, indicating it was the most volatile and least rewarding investment among these stocks over the analyzed period.
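
One way to condense the scatter plot into a single number per stock is the average daily return divided by its standard deviation. The sketch below is an addition, not part of the original notebook. It assumes a risk-free rate of zero, so it is only a simplified Sharpe-style ratio, and the tickers `AAA` and `BBB` with their returns are hypothetical values made up for illustration.

```python
import pandas as pd

# Hypothetical daily returns for two made-up tickers (not from stocks.csv)
daily_returns = pd.DataFrame(
    {
        "AAA": [0.010, -0.004, 0.006, 0.002, -0.001],
        "BBB": [0.030, -0.025, 0.020, -0.018, 0.008],
    }
)

summary = pd.DataFrame(
    {
        "Average Daily Return": daily_returns.mean(),
        "Risk": daily_returns.std(),
    }
)

# Assumption: risk-free rate of zero, so this is a simplified Sharpe-style ratio
summary["Return per Unit of Risk"] = (
    summary["Average Daily Return"] / summary["Risk"]
)

print(summary.round(4))
```

In this made-up example BBB has the higher average return, but AAA earns more return per unit of risk. In the notebook, the same ratio could be added to risk_return_df as a new column.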

Summary

We have gone through the process of performing a Quantitative Analysis of the stock market using Python. Quantitative Analysis in the stock market is a financial methodology that utilizes mathematical and statistical techniques to analyze stocks and financial markets.
